Problem description
How to use pycurl if requested data is sometimes gzipped, sometimes not?
I'm doing this to fetch some data:
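import pycurl
from io import BytesIO  # on Python 3, pycurl passes bytes to the write callback

c = pycurl.Curl()
c.setopt(pycurl.ENCODING, 'gzip')      # send Accept-Encoding: gzip and decode the response automatically
c.setopt(pycurl.URL, url)
c.setopt(pycurl.TIMEOUT, 10)
c.setopt(pycurl.FOLLOWLOCATION, True)  # follow the 302 redirect
xml = BytesIO()
c.setopt(pycurl.WRITEFUNCTION, xml.write)
c.perform()
c.close()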
My urls are typically of this sort:
http://host/path/to/resource-foo.xml
Usually I get back 302 pointing to:
http://archive-host/path/to/resource-foo.xml.gz
Given that I have set FOLLOWLOCATION, and ENCODING gzip, everything works great.
The problem is that sometimes I have a URL which does not result in a redirect to a gzipped resource. When this happens, c.perform() throws this error:
pycurl.error: (61, 'Error while processing content unencoding: invalid block type')
This suggests to me that pycurl is trying to gunzip a resource that is not gzipped.
Is there some way I can instruct pycurl to figure out the response encoding, and gunzip or not as appropriate? I have played around with using different values for ENCODING, but so far no beans.
The pycurl docs seem to be a little lacking. :/
thx!
‑‑‑‑‑
Reference solution
Method 1:
If worst comes to worst, you could omit the ENCODING 'gzip' option, set HTTPHEADER to ['Accept-Encoding: gzip'] (pycurl expects a list of "Name: value" strings rather than a dict), check the response headers for "Content-Encoding: gzip" and, if it is present, gunzip the response yourself.
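The approach described above could look roughly like the following. This is only a minimal sketch, assuming Python 3 and an installed pycurl. The helper names fetch, decode_body, and is_gzip_encoded are hypothetical and are not part of pycurl. Because FOLLOWLOCATION is set, the header callback sees the headers of every hop, so the sketch discards the collected headers whenever a new status line arrives and only inspects those of the final response.

```python
import gzip
from io import BytesIO


def is_gzip_encoded(header_lines):
    """Return True if a header line declares Content-Encoding: gzip."""
    for line in header_lines:
        name, sep, value = line.decode("iso-8859-1").partition(":")
        if sep and name.strip().lower() == "content-encoding":
            if "gzip" in value.strip().lower():
                return True
    return False


def decode_body(body, header_lines):
    """Gunzip the body only when the response headers say it is gzipped."""
    if is_gzip_encoded(header_lines):
        return gzip.decompress(body)
    return body


def fetch(url):
    """Fetch url with pycurl and return the (possibly gunzipped) body as bytes."""
    # Imported here so the two helpers above can be used without pycurl installed.
    import pycurl

    headers = []
    body = BytesIO()

    def on_header(line):
        # A new status line means a new response (e.g. after a redirect),
        # so forget the headers of the previous hop.
        if line.startswith(b"HTTP/"):
            headers.clear()
        headers.append(line)

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.TIMEOUT, 10)
    c.setopt(pycurl.FOLLOWLOCATION, True)
    # No ENCODING option: ask for gzip ourselves and decode manually.
    c.setopt(pycurl.HTTPHEADER, ["Accept-Encoding: gzip"])
    c.setopt(pycurl.HEADERFUNCTION, on_header)
    c.setopt(pycurl.WRITEFUNCTION, body.write)
    try:
        c.perform()
    finally:
        c.close()
    return decode_body(body.getvalue(), headers)
```

With this in place, xml = fetch(url) would return the raw bytes of the document whether or not the final response was gzipped, provided the server labels gzipped responses with a Content-Encoding header.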
(by billc, Piskvor left the building)